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Abstract 

"  In  applying  active-set  methods  to  sparse  quadratic  programs,  it  is  desirable  to  utilize  existing 
sparse-matrix  techniques.  Vie  describe  a  quadratic  programming  method  based  on  the  classical 
Schur  complement.  Its  key  feature  is  that  much  of  the  linear  algebraic  work  associated  with  an 
entire  sequence  of  iterations  involves  a  fixed  sparse  factorization.  Updates  are  performed  at  every 
iteration  to  the  factorization  of  a  smaller  matrix,  which  may  be  treated  as  dense  or  sparse. 

The  use  of  a  fixed  sparse  factorization  allows  an  “^off-the  shelf^sparse  equation  solver  to  be 
used  repeatedly.  This  feature  is  ideally  suited  to  problems  with  structure  that  can  be  exploited 
by  a  specialized  factorization.  Moreover,  improvements  in  efficiency  derived  from  exploiting  new 
parallel  and  vector  computer  architectures  are  immediately  applicable. 

An  obvious  application  of  the  method  is  in  sequential  quadratic  programming  methods  for 
nonlinearly  constrained  optimization,  which  require  solution  of  a  sequence  of  closely  related 
quadratic  programming  subproblems.  AVe  discuss  some  ways  in  which  the  known  relationship 
between  successive  problems  can  be  exploited;  r. > r  J  '  •  },A  •  ' 


f  This  paper  is  dedicated  to  the  memory  of  James  H.  Wilkinson,  whose  unfailing  insight  and 
clarity  of  exposition  remain  an  inspiration  to  us. 

The  material  contained  in  this  report  is  based  upon  research  supported  by  the  Air  Force  Office  of 
Scientific  Research  Grant  87-01962;  the  U.S.  Department  of  Energy  Grant  DE-FG03-87ER25030; 
National  Science  Foundation  Grant  CCR-8413211;  and  the  Office  of  Naval  Research  Contract 
N00014-87-K-0142. 
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1.  Background  on  Quadratic  Programming 

1.1.  Statement  of  the  problem.  The  topic  of  concern  is  the  quadratic  programming  (QP) 
problem  of  minimizing  a  quadratic  objective  function  subject  to  linear  constraints  on  the  variables. 
Quadratic  programs  may  be  stated  in  several  (equivalent)  forms.  We  shall  consider  primarily 
quadratic  programs  in  the  following  standard  form : 


minimize  cTx  +  \xTHx 

*€»*  2  (1.1) 

subject  to  Ax  =  6,  l  <  x  <  u, 

where  the  Hessian  matrix  H  is  symmetric  and  /lismxn.  Components  of  t  and  u  may  be  taken 
as  —oo  and  -f-oo  if  no  bound  is  present.  We  assume  throughout  that  A  has  full  row  rank.  The 
constraints  Ax  =  b  are  called  the  general  constraints  of  (1.1).  We  define  the  (linear)  function 
g(x)  as  c  +  II  x,  the  gradient  of  the  quadratic  objective  function. 

The  term  “standard  form”  refers  to  the  constraints  in  (1.1),  and  means  that  the  only  in¬ 
equality  constraints  are  simple  bounds  on  the  variables.  (Section  6  treats  some  of  the  issues  that 
arise  with  alternative  formulations.)  In  much  of  our  discussion,  we  shall  treat  all  the  variables  of 
(1.1)  uniformly.  On  some  occasions,  however,  the  “original”  variables  of  a  quadratic  program  will 
be  distinguished  from  its  “slack”  variables.  A  quadratic  program  will  contain  slack  variables  if  its 
“natural”  formulation  includes  general  inequality  constraints.  For  example,  a  general  inequality 
constraint  ujx  >  /?<  is  replaced  by  the  equality  constraint  ajx  4-  s,  =  /?*,  and  the  standard-form 
version  of  the  problem  includes  an  additional  slack  variable  subject  to  the  bound  s,  <  0.  Slack 
variables  have  many  special  features;  one  of  particular  importance  is  that  they  do  not  appear  in 
the  objective  function.  This  paper  is  concerned  only  with  problems  in  which  the  Hessian  matrix 
II  in  the  original  variables  is  positive  definite.  The  indefinite  case  will  be  treated  in  a  forthcoming 
paper. 

Our  interest  in  sparse  quadratic  programs  arises  in  large  part  from  the  desire  to  apply 
sequential  quadratic  programming  (SQP)  methods  to  large  nonlinearly  constrained  problems.  In 
an  SQP  method,  each  iteration  involves  solution  of  a  quadratic  programming  subproblem,  which 
itself  must  be  solved  by  an  iterative  procedure.  An  important  feature  of  these  subproblems  is 
that  information  from  each  can  be  exploited  to  solve  the  next  more  quickly,  to  the  extent  that 
later  subproblems  usually  require  only  a  single  iteration  (see  Gill,  Murray,  Saunders  and  Wright, 
1985).  Thus,  the  first  QP  iteration  comprises  a  substantial  proportion  of  the  total  effort,  which 
implies  that  initialization  of  the  QP  algorithm  is  just  as  critical  as  subsequent  iterations. 

Sections  2-5  describe  a  new  method  (the  Schur-complement  or  SC  method)  for  quadratic 
programming.  Before  giving  details  of  the  SC  method,  Section  1.2  introduces  some  notation,  and 
Section  1.3  gives  a  condensed  overview  of  active-set  quadratic  programming  methods. 

1.2.  Notation.  The  proposed  method  is  iterative,  and  we  usually  consider  a  single  (typical) 
iteration.  Unbarred  and  barred  symbols  will  be  used  to  denote  quantities  associated  with  itera¬ 
tions  k  and  k  +  1.  The  only  exception  is  the  use  of  the  suffix  “0”  to  denote  quantities  associated 
with  the  first  iteration. 

We  shall  make  extensive  use  of  properties  of  the  inertia  of  a  matrix  A',  denoted  by  In(A'), 
which  is  an  integer  triple  (nji, 7),  where  «,  fi  and  7  are  the  numbers  of  positive,  negative  and 
zero  eigenvalues  of  A'. 
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with  A/  nonsingular,  the  Schur  complement  of  M  in  A'  will  be  denoted  by  K/M ,  and  is  defined 
as 

A'/A/  =  G  -  N\1-'Nt. 

We  sometimes  refer  simply  to  “the”  Schur  complement  when  the  relevant  matrices  are  clear.  (For 
further  discussion  of  the  Schur  complement,  see  Cottle,  1974.) 

1.3.  Background  on  active-set  methods.  The  Schur-complement  method  is  a  primal- 
feasible*  active-set  method.  For  an  overview,  see,  e.g.,  Fletcher  (1981).  Each  iteration  has  the 
following  general  structure:  given  the  current  iterate  x ,  the  next  iterate  is  defined  by 


x  =  x  +  otp , 


(1.2) 


where  the  vector  p  is  the  search  direction,  and  the  nonnegative  scalar  a  is  the  steplengtli.  An 
initial  feasibility  phase  is  performed  to  find  a  point  that  satisfies  the  constraints  of  (1.1)  (see 
Section  4),  and  all  iterates  are  thereafter  constructed  to  retain  feasibility. 

A  major  question  in  solving  (1.1)  is  the  identification  of  the  active  set  of  constraints,  namely 
the  constraints  that  hold  with  equality  at  the  solution.  Because  (1.1)  is  in  standard  form,  the 
active  set  must  contain  the  general  constraints,  plus  the  set  of  variables  that  lie  on  one  of  their 
bounds  at  the  solution.  An  active-set  method  maintains  an  estimate  of  the  active  set  (called  the 
working  set),  which  is  a  linearly  independent  set  of  constraints  that  are  satisfied  exactly  at  the 
beginning  of  each  iteration.  The  matrix  of  coefficients  of  constraints  in  the  working  set  will  be 
denoted  by  Aw ,  and  always  includes  the  equality  constraints.  Thus,  a  typical  working  set  has 
the  form 

Aw  =  ii,)' 

where  IT  contains  rows  of  the  identity  corresponding  to  variables  currently  on  their  bounds.  The 
constraints  in  the  working  set  are  (temporarily)  treated  as  equalities  during  the  current  iteration. 

The  search  direction  p  is  defined  as  the  solution  of  the  following  equality-constrained  QP: 


minimize  9TP+\pTIlp 

p€»" 

subject  to  .4,vp=0, 


(1.3) 


where  tj  denotes  g(x).  The  constraints  Awp  =  0  ensure  that  constraints  in  the  working  set  remain 
unaltered  by  any  move  along  p.  In  particular,  the  components  of  p  corresponding  to  bounds  in 
thp  working  set  (“active”  bounds)  must  be  zero.  The  solution  of  (1.3)  is  the  step  from  .r  to 
the  minimizer  of  the  quadratic  objective  function  of  (1.1),  subject  to  treating  the  w'orking  set  as 
equalities.  The  optimality  and  feasibility  conditions  for  (1.3)  are  expressed  by  the  linear  system 


(I  *)(-;)-(')■ 


1.4) 


where  ft  is  the  Lagrange  multiplier  vector  for  the  constraints  of  (1.3). 
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Almost  all  active-set  feasible-point  methods  for  convex  quadratic  programming  are  mathe¬ 
matically  identical  in  the  sense  that,  under  certain  conditions,  the  same  sequence  of  iterates  is 
generated  (see  Djang,  1980,  and  Best,  1984).  Differences  in  efficiency  and  numerical  stability 
arise  from  the  techniques  chosen  for  solving  (1.4).  Null-spa.ee  methods  are  based  on  computing 
either  implicitly  or  explicitly  a  (nonunique)  matrix  Z  whose  columns  span  the  null  space  of  Aw. 
The  solution  of  (1.4)  can  then  be  computed  as  p  =  Zpz,  where  pz  satisfies 

ZTHZpz  =  -ZTg.  (1.5) 

The  matrix  ZT  H Z  is  called  the  projected  Hessian.  We  stress  that  the  projected  Hessian  depends 
on  the  choice  of  Aw  as  well  as  on  the  representation  of  Z. 

If  p  is  nonzero,  two  situations  are  possible.  The  point  x  +  p  may  violate  one  or  more  currently 
inactive  bounds.  (In  this  case,  Aw  cannot  be  the  correct  active  set.)  Feasibility  is  retained  by 
determining  the  maximum  nonnegative  step  a  <  1  such  that  x  +  ap  is  feasible.  The  bound  that 
becomes  satisfied  exactly  at  x  +  ap  is  then  “added”  to  the  working  set  for  the  next  iteration  by 
adding  a  row  of  the  identity  matrix  to  Aw . 

Otherwise,  x  -F  p  is  feasible,  and  x  =  x  +  p.  Since  p  is  the  step  to  the  minimizer  of  (1.3),  it 
must  hold  that  ZT g[x)  —  0,  which  implies  that  g(x)  =  A^p  for  some  Lagrange  multiplier  vector 
p.  If  the  components  of  p  corresponding  to  active  lower  bounds  are  nonnegative,  and  those 
corresponding  to  active  upper  bounds  are  nonpositive,  then  x  is  the  (unique)  solution  of  (1.1). 
Otherwise,  there  is  at  least  one  component  with  the  “wrong”  sign,  which  means  that  deleting 
the  corresponding  constraint  from  the  working  set  will  produce  a  feasible  direction  of  descent  for 
the  objective  function.  (The  same  interpretation  applies  if  p  =  0:  either  x  is  optimal  for  (1.1), 
or  a  constraint  can  bo  deleted  from  the  working  set.)  When  a  bound  constraint  is  deleted,  the 
associated  variable  is  said  to  be  “freed”  from  its  bound,  and  one  of  the  rows  of  Ix  is  removed 
from  Aw . 

The  standard  convergence  properties  of  this  algorithm  are  summarized  by  the  following  two 
theorems,  which  are  stated  without  proof  (see,  e.g.,  Gill  and  Murray,  1978;  Fletcher,  1981). 

Theorem  1.1.  (Linear  independence  of  the  working  set).  If  the  initial  working  set  is  chosen  so 
that  ,4o  has  full  row  rank,  and  if  Awp  =  0  at  all  subsequent  iterations,  then:  (i)  every  working 
set  has  full  row  ran k;  and  (ii)  the  projected  Hessian  is  positive  definite  at  every  iteration.  | 

Theorem  1.2.  Assume  that  the  feasible  region  of  (1.1)  has  no  degenerate  vertices,  i.e.,  the  set 
of  constraints  defining  every  vertex  is  linearly  independent.  Then  the  feasible-point  active-set 
method  described  above  will  terminate  at  the  unique  minimizer  of  (1.1)  in  a  finite  number  of 
iterations.  | 

if  degenerate  vertices  exist,  additional  procedures  should  be  included  in  the  algorithm  to 
prevent  cycling,  i.e.,  making  an  infinite  number  of  changes  in  the  working  set  without  moving 
from  the  current  point.  Recent  techniques  for  treating  degeneracy  are  described  in.  for  example, 
Fletcher  ( 19X5),  Busovaca  (1985),  Dax  (1985),  Osborne  (1985),  Ryan  and  Osborne  (1986),  and 
Gill  et  a I.  (1987b). 

1.3.  Special  properties  of  the  standard  form.  So  far,  we  have  discussed  the  role  of 
the  matrix  Aw  without  particular  attention  to  the  computational  advantages  that  arise  when 
the  problem  is  in  standard  form.  Standard  form  allows  the  nonzero  (“free”)  components  of  the 
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search  direction  to  be  computed  using  a  matrix  whose  column  dimension  is  equal  to  the  number  of 
free  variables  (rather  than  the  total  lumber  of  variables).  To  formalize  this  idea,  let  nFR  denote 
the  number  of  free  variables  (i.e.,  corresponding  to  bounds  not  in  .4W),  and  let  the  subscript 
"fr"  denote  the  corresponding  components  of  a  vector  or  matrix.  For  example,  AFR  denotes 
the  m  x  nFR  submatrix  of  columns  of  A  corresponding  to  free  variables.  Similarly,  the  subscript 
"F.v"  means  the  components  corresponding  to  fixed  variables  (i.e.,  those  whose  bounds  are  in  the 
working  set).  (We  shall  henceforth  switch  freely  between  the  terminologies  of  “working  sets"  and 
“free/fixed  variables".  In  general,  the  main  iteration  will  be  described  in  terms  of  changes  to  the 
working  set  because  the  structure  of  the  constraints  does  not  affect  the  algorithm  at  that  level.) 

The  vector  pFR  satisfies  the  linear  system 

H fr  ^  ^  —  Pfr  ^  ^  ~Pfr  ^  ^  Off 

wiiere  ;r  denotes  the  vector  of  multipliers  for  the  general  equality  constraints.  The  reduced 
gradient  vector  A  for  the  fixed  variables  (i.e.,  the  Lagrange  multiplier  associated  with  active 
bounds)  may  be  computed  as  A  =  rjFX  —  Ajxn,  where  g  denotes  g(x  +  p). 

Equation  (1.6)  is  called  the  Kuhn-Tucker  system  or  just  the  KT  system.  The  following  lemma 
characterizes  the  relationship  between  the  eigenvalues  of  A’  and  the  eigenvalues  of  the  projected 
Hessian  ZJllZ. 

Lemma  1.1.  Let  M  be  aim  x  n  symmetric  matrix  and  N  an  rn  X  n  matrix  of  full  row  rank.  If 
Z  is  an  n  x  (n  -  m)  matrix  such  that  N Z  =  0,  then 


(1-6) 


In 


In(ZT MZ)  -f  (m,m,0). 


Proof.  For  a  proof,  see  Could  (1985).  B 

The  inertia  of  W  can  be  deduced  by  applying  Lemma  1.1  to  (1.6)  and  invoking  Theorem  1.1 
to  show  that  the  projected  Hessian  is  positive  definite;  thus,  we  conclude  that  In(A’)  =  (n,m,0). 

1.4.  Special  features  of  large  quadratic  programs.  Techniques  for  obtaining  a  null-space 
basis  Z  explicitly  or  implicitly  have  been  extensively  studied  recently,  with  particular  reference  to 
continuity  (see,  e.g.,  Coleman  and  Sorensen,  1981;  (Jill,  Murray,  Saunders,  Stewart  and  Wright, 
19x5).  When  A  is  dense.  Z  is  usually  computed  directly  from  a  QR  factorization  of  A.  When  A 
is  sparse,  however,  known  techniques  for  obtaining  an  orthogonal  and  sparse  Z  may  be  expensive 
in  time  and  storage,  although  some  recent,  approaches  appear  promising  (see,  e.g.,  Coleman  and 
Pot  lien ,  19*6;  Cilbert  and  Heat  h,  1986). 

I  he  representation  of  Z  most  commonly  used  in  sparse  problems  is  called  the  reduced- 
gradient  form  of  Z.  and  is  obtained  as  follows.  The  columns  of  A FR  are  partitioned  so  as  to 
identify  explicitly  an  m  x  rn  uonsingular  matrix  B  (the  basis).  Assuming  that  B  is  at  the  “left" 
of  A  f  „ ,  we  have 

A„=(B  S).  (1.7) 

(In  practice,  the  columns  of  B  may  occur  anywhere.) 


iV\v 
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When  AFR  has  the  form  (1.7),  a  basis  for  the  null  space  of  is  given  by  the  columns  of 
the  (non-orthogonal)  matrix  ZFR  defined  as 

ZF  *=(  Bj  ^  >  so  that  Z  =  .  (1.8) 

Furthermore, 

ZTfrHfrZfr  =  zthz. 

Let  nz  —  n  -  (m  +  nFX),  so  that  ZFR  has  nz  columns.  The  form  of  (1.8)  means  that  matrix- 
vector  products  ZTv  or  Zv  can  be  computed  using  a  factorization  of  B  (typically,  a  sparse  LU 
factorization;  see  Gill  et  a L,  1987a),  and  Z  need  not  be  stored  explicitly. 

For  large  problems,  the  projected  Hessian  ZT HZ  associated  with  the  solution  of  (1.6)  will 
generally  be  much  denser  than  II  and  B.  If  nz  is  small  enough  to  allow  the  storage  of  a  dense 
matrix  of  dimension  nz  x  n2,  the  null-space  basis  provided  by  (1.8)  is  very  effective  in  methods 
that  approximate  ZT II Z  (see  Murtagh  and  Saunders,  1978). 


2.  The  Schur-Complement  Quadratic  Programming  Method 

When  ,4  and  H  are  large  and  sparse,  a  single  system  of  the  form  (1.6)  can  be  solved  reliably 
and  efficiently  with  the  sparse  matrix  package  MA27  (see  Duff  and  Reid,  1982;  Duff,  Erisman 
and  Reid,  1986).  In  an  active-set  QP  method,  however,  a  sequence  of  such  systems  must  be 
solved,  each  differing  from  the  preceding  by  a  single  row  and  column.  In  a  method  based  on  a 
straightforward  interpretation  of  (1.6),  the  search  direction  p  and  multiplier  p  would  be  computed 
from  a  KT  system  that  varies  in  composition  and  size  as  the  working  set  changes.  In  contrast,  we 
now  show  that  the  special  nature  of  these  changes  allows  us  to  define  a  QP  algorithm  in  which 
the  solution  of  (1.6)  may  be  obtained  during  k  successive  iterations  using  a  fixed  factorization  of 
the  initial  KT  system,  and  a  factorization  of  a  smaller  matrix  of  (at  most)  order  k. 


2.1.  Computation  of  the  search  vector  and  multipliers.  To  illustrate  an  iteration  of  the 
Schur-complement  method,  we  first  consider  an  example  with  4  variables  and  a  single  general 
constraint,  where  bounds  2  and  4  are  in  the  initial  working  set.  Thus,  p2  =  Pi  =  0,  and  the  initial 
KT  system  (1.6)  is 


f  h\\ 

h  13 

au  ^ 

1  -V\  \ 

h\a 

^33 

a  i3 
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9a 

\flii 

«13 

o  ) 

\  *i/ 
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At  the  next  iteration,  suppose  that  the  first  variable  is  to  be  fixed  on  a  bound,  so  that  p\  =  0. 
It  is  easy  to  verify  that  p  satisfying  ( 1.6)  for  the  revised  working  set  satisfies 


(  fin  fi  13  «11 

1  \ 

1 

t-p\\ 
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/9i\ 

fi  1 3  fi.33  « 13 
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9a 
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0 
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\  1  0  0 

0  ) 

\  > 
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where  A1  is  the  reduced  gradient  for  the  newly  fixed  variable. 
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If,  on  the  other  hand,  variable  2  is  to  be  freed  from  its  bound  at  the  next  iteration,  the 
desired  p  satisfies 


i  /ill 
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The  general  rule  is  that  the  previous  KT  system  is  bordered  by  a  row  and  column  of  the 
identity  when  a  variable  is  fixed,  and  by  the  free  elements  of  a  row  and  column  of  II  and  a  column 
of  ,4  when  a  variable  is  freed. 

The  above  process  can  be  repeated  in  an  obvious  way  over  a  sequence  of  iterations.  Let 
t0  denote  the  initial  point  of  the  sequence,  A'o  the  KT  system  at  t0,  and  no  the  number  of 
free  variables  at  To-  Assume  that  k  changes  to  the  working  set  have  taken  place  since  A'o  was 
factorized.  Let  /  denote  an  (no  +  m)- vector  whose  first  no  components  are  the  components  of 
the  current  gradient  corresponding  to  the  free  variables  at  .To.  and  whose  remaining  m  elements 
are  zero.  Let  the  Ar-vector  w  be  defined  as 


if,  =  /  sAx)  i 

l  0  c 


if  x s  was  freed  at  iteration  j; 
otherwise. 


Note  that  both  /  and  w  depend  on  the  current  iterate  x. 

After  k  iterations,  the  symmetric  bordered  system  to  be  solved  is  of  dimension  at  most  n0  +  A\ 
and  has  the  form 


(Sv)(:)-C). 


where  V  is  n0  x  k  and  V  is  k  x  k.  The  j- th  column  of  U  is  a  column  of  the  identity  if  a  variable 
was  fixed  at  the  j-th  iteration;  otherwise  it  contains  elements  from  H  and  A ,  as  described  above. 
The  nonzero  entries  of  V  are  elements  of  II.  (If  no  variables  have  been  freed,  U  contains  only 
columns  of  the  identity  and  V  is  zero.) 

The  vectors  y  and  2  must  be  “unscrambled”  to  obtain  p,  p  and  A.  The  first  n0  components 
of  y  are  the  elements  of  —p  corresponding  to  the  free  variables  at  To-  The  remaining  nonzero 
elements  of  p  and  the  reduced  gradients  for  the  newly  fixed  variables  are  found  from  z  as  follows: 

(  X,  if  x,  was  fixed  at  iteration  j\ 

3  (  —p,  if  t,  was  freed  at  iteration  j. 

Since  (2.4)  (in  general)  increases  in  dimension  by  one  at  every  iteration,  it  might  appear  that 

there  is  no  benefit  from  this  approach.  However,  the  key  point  is  that  (2.4)  can  be  solved  using 

factorizations  of  A'o  and  C,  the  k  X  k  Schur  complement,  of  A'o: 

C  =  V-UTIiolU  (2.5) 

(see,  e.g.,  Bisschop  and  Meerhaus,  1977,  1980;  Gill  et  a/.,  1984).  The  following  equations  are 
solved  in  turn: 


A'o  v  =  f 
Cz  =  w  -  U  Tv 
A'o y  =  f  -  Uz. 


(2.6a) 

(2.6ft) 

(2.6c) 
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Thus,  the  work  required  to  perform  a  QP  iteration  is  dominated  by  two  solves  with  A'0  and  one 
solve  with  C.  If  k  is  small  enough,  dense  QR  or  LU  factors  of  C  may  be  maintained.  To  exploit 
symmetry,  the  symmetric  indefinite  factorization  (see  Section  5)  must  be  used.  Each  of  these 
factorizations  can  be  updated  efficiently  and  in  a  numerically  stable  manner  to  reflect  changes  in 
the  working  set;  see,  e.g.,  Gill  et  a I.  (1974),  Gill  et  a I.  (1987a)  and  Sorensen  (1977).  In  all  cases, 
the  numerical  stability  of  (2.6)  depends  largely  on  the  condition  of  Kq. 

Each  change  in  the  working  set  results  in  addition  of  a  new  row  and  column  to  C .  To  show 
this,  consider  a  single  change  in  the  working  set,  and  write  the  associated  KT  system  as 

(#?  [>)’  Where  °  =  (U  U)  and  V=(vT  l)'  (2-7) 

(The  definitions  of  u,  v  and  a  depend  on  the  nature  of  the  change  in  the  working  set.)  The  new 
Schur  complement  C  for  (2.7)  is  given  by 

C  =  V-VTh0-‘U  =  ^T  ;)-(^T)a0-,C  u).  (2.8) 

Comparison  of  (2.7)  and  (2.5)  reveals  that  the  Schur  complement  is  bordered  by  a  single  row  and 
column: 


where 

A'0 q  =  u,  t  =  v  -  UTq  and  7  =  a  -  uTq.  (2.9 b) 

Note  that  a  solve  with  A'o  is  needed  to  update  C . 

The  dimension  of  C  need  not  increase  if  a  variable  returns  to  its  original  status  during  the 
sequence  of  iterations.  For  example,  suppose  that  a  given  variable  is  fixed  at  the  initial  point, 
subsequently  freed,  and  then  later  fixed  again  at  either  the  same  or  opposite  bound.  (The  same 
comments  apply  if  a  variable  is  originally  free,  and  then  fixed  and  freed  again.)  To  effect  the 
second  change  in  the  working  set,  the  dimension  of  C  can  be  reduced  by  one,  by  simply  removing 
the  column  of  U  associated  with  the  first  change  and  then  modifying  C  to  “undo”  the  first  update. 
(It  is  easy  to  show  that  deleting  a  column  front  U  is  equivalent  to  deleting  a  row  and  column 
from  C.) 

In  Section  5.2  we  identify  the  special  relationship  of  the  Schur  complement  to  the  projected 
Hessian  in  certain  cases.  It  is  therefore  of  interest  to  know  the  inertia  of  the  Schur  complement, 
which  is  characterized  by  the  following  lemma. 

Lemma  2.1.  Consider  an  iteration  of  a  feasible-point  active-set  method  in  which  p  and  p  are 
corn fiiited  from  (2.6).  If  if  x  of  the  fixed  variables  were  originally  free  at  xq,  and  iFR  of  the  free 
variables  were  originally  fixed  at  Xq  ,  then 

1 11(6  )  =  ( i FR ,  tfx , 0). 


Proof.  The  bordered  matrix  (2.1)  may  be  permuted  (symmetrically)  to  a  matrix  M  of  the  form 
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where  K  is  formed  from  iFX  rows  of  the  identity  matrix.  The  elements  of  G  and  A  are  formed 
from  variables  in  two  categories:  those  that  were  free  at  (and  therefore  in  A’o)  and  those  that 
were  freed  in  subsequent  iterations.  Let  u  denote  the  dimension  of  G. 

The  inertia  of  C  (2.9)  after  a  single  change  in  the  working  set  is  given  by 

In(C)  =  111(0  +  Infryo, 

and  we  may  also  write 

In(C7C)  =  In (Al/M)  =  In(  A/  )  -  In(.U). 

Since  the  projected  Hessian  matrices  ZT II Z  and  Z1  HZ  are  positive  definite.  Lemma  1.1  implies 
that 

In(  M)  -  In(  M  )  =  (P,m  +  tV.v.  0)  -  (ia  m  +  iFX,  0). 

If  ,\I  is  expanded  by  fixing  a  variable  on  its  bound,  the  dimension  of  G  remains  the  same,  but  E 
is  expanded  by  a  single  row  (a  coordinate  vector).  Therefore,  0  =  v,  lFX  =  iFX  +  1  and 

In(C)  =  In(C)  +  (0,1,0). 

Similarly,  if  a  variable  is  freed  from  its  bound.  0  =  u  +  1,  tFX  =  iFX  and 

In(C)  =  In(C)  +  (1,0.0). 

The  desired  result  follows  by  applying  the  arguments  above  to  each  expansion  of  the  Scliur 
complement.  | 

2.2.  Refactorization.  As  the  dimension  of  C  grows,  the  work  needed  to  solve  (2.6)  increases, 
as  does  the  required  storage.  It  is  therefore  necessary  to  “restart”  at  a  “new”  x0,  and  to  fac¬ 
torize  the  current  KT  system  from  scratch  (as  in  linear  programming,  where  the  current  basis  is 
refactori/.ed.) 

Typically,  the  dimension  of  C  is  allowed  to  reach  a  specified  limit  (say,  100)  before  refactor¬ 
ization.  If  the  QP  is  a  “later”  subproblem  in  an  SQP  method,  the  solution  is  likely  to  be  found 
before  refactorization  is  required.  The  exact  point  at  which  refactorization  becomes  worthwhile 
depends  on  the  problem.  In  general,  the  decision  to  refactorize  is  guided  by  considerations  similar 
to  those  in  the  simplex  method  for  linear  programming — i.e.,  it  is  probably  desirable  to  refactorize 
when  the  cost  of  an  iteration  exceeds  an  average  figure  determined  by  amortizing  the  cost  of  the 
initial  factorization  over  a  number  of  iterations.  Refactorization  may  also  be  mandated  by  a  lack 
of  storage. 

Refactorization  provides  an  opportunity  to  check  for  any  possible  deterioration  in  feasibility 
through  the  accumulation  of  rounding  errors,  by  computing  the  row  residuals  b  -  ,4.ro  for  the 
general  constraints.  If  xn  is  unacceptable  because  of  large  row  errors  (i.e.,  large  residuals  in  the 
general  constraints),  one  or  two  steps  of  iterative  refinement  may  be  helpful  (see,  e.g.,  Wilkinson. 
190.j;  B  jbrck.  19S7).  (’nfortunatelv,  iterative  refinement  can  cause  some  of  the  variables  to  violate 
iie-ir  bounds.  It  is  therefore  essential  for  any  application  of  iterative  improvement  to  include  a 
procedure  for  restoring  feasibility  with  respect  to  the  bounds  (see  Section  4).  Because  of  rounding 
errors,  t  he  possibility  of  cycling  during  this  process  cannot  be  completely  eliminated.  For  example, 
she  algorithm  mtihl  cycle  forever  between  points  that  alternately  violate  the  bounds  and  general 
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constraints.  However,  the  changes  to  the  variables  caused  by  refinement  tend  to  be  small,  and 
cycling  is  unlikely. 

Ill-conditioning  in  A'o  may  lead  to  serious  error  in  the  computed  solution  of  the  bordered 
system  (2.4).  However,  as  each  new  variable  is  fixed  on  its  bound,  the  KT  system  may  become 
better  conditioned  and  merit  refactorization.  In  some  circumstances,  refactorization  can  be  post¬ 
poned  by  applying  iterative  refinement  on  both  A'o  and  C  whenever  Ap  has  drifted  away  from 
zero.  (This  form  of  iterative  refinement  is  unlikely  to  cause  loss  of  feasibility  with  respect  to 
the  bounds.)  Another  alternative  is  to  use  deflated  block  elimination  (see  Chan,  1984;  Chan  and 
Grossi,  1985). 


3.  Avoiding  a  Solve  with  A'o 

When  solving  (2.4),  the  calculations  can  be  rearranged  so  that  only  one  solve  with  A'o  is  needed 
at  every  iteration  (in  contrast  to  the  two  solves  in  (2.6)).  The  “trick”  is  to  define  the  right-hand 
side  of  (2.4)  so  that  certain  components  do  not  change.  This  can  be  accomplished  by  computing 
the  step  q  from  xo  to  the  minimizer  of  (1.3),  rather  than  the  step  p  from  x,  i.e.,  q  satisfies 

x0  +  q  =  x  +  p. 

To  illustrate  this  process,  we  reconsider  the  four-variable  example  of  Section  2.1.  Suppose 
that  the  next  iteration  involves  fixing  variable  1  at  its  lower  bound  l\ .  To  make  the  first  component 
of  .To  +  q  equal  to  £lt  q  must  satisfy 
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where  g(x o)  denotes  the  quadratic  objective  gradient  at  xo-  It  is  easy  to  verify  that  the  first 
component  of  xo  +  q  is  given  by 

(*o  +  q) l  =  (*o)i  +  <7i  =  (zo)i  -  (*o  -  t)i  =  t\. 


as  required. 

On  the  other  hand,  if  variable  2  is  to  be  freed  at  the  next  iteration,  then  q  satisfies 
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To  carry  out  this  strategy,  let  /o  denote  the  vector  /  from  Section  2.1  evaluated  at  xo,  and 
let  the  A:-vector  w  be  defined  as 


Wj 

\ 

I 


(  (xo  -  x),  if  x,  was  fixed  at  iteration  j; 
l  Os(i'o)  if  A**  was  freed  at  iteration  j. 
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The  value  (xo  -  .r)^  is  defined  as  Wj  when  variable  s  is  fixed  at  iteration  k  because  xs  will  then 
equal  or  u ,  (depending  on  which  bound  is  in  the  working  set). 

In  general,  given  /o  and  w,  the  search  direction  and  multiplier  vector  may  be  found  from  the 
solution  of 

(£:•)(:)-(!)■ 

Examination  of  (2.6a)  shows  that  the  right-hand  side  associated  with  (3.2)  is  constant ,  so  that 
the  solution  of  A'qi’o  =  /o  needs  to  be  computed  only  once,  at  the  first  iteration.  Thereafter. 
(2.6b)  is  simply 

C:  -  w  —  E'V 

Only  one  of  the  properties  mentioned  in  Section  2  does  not  apply  to  this  modified  iteration. 
The  exception  occurs  when  a  variable  moves  from  one  bound  to  its  “opposite”  bound.  In  this  case, 
it  is  not  possible  to  decrease  the  dimension  of  the  Schur  complement  and  maintain  a  constant 
vector  /o,  since  deleting  the  column  of  U  has  the  effect  of  fixing  the  variable  at  the  original  bound 
instead  of  the  “new”  bound.  To  allow  for  this  special  case,  the  Schur  complement  is  expanded 
as  if  the  variable  had  originally  been  free  at  xo.  This  modification  is  a  special  case  of  (2.9)  with 
the  values  u  =  0  and  v  =  es,  where  s  denotes  the  iteration  at  which  the  variable  became  free, 
thereby  adding  the  s-th  row  and  column  to  C.  The  new  Schur  complement  is  given  by 


and  it  is  not  necessary  to  compute  q,  t  or  7  in  (2.9). 

4.  Finding  an  Initial  Feasible  Point 

In  order  to  apply  the  active-set  algorithm  described  in  Section  1.3,  a  feasible  starting  point  is 
necessary.  In  the  dense  case,  problem  ( 1.1)  is  often  solved  in  two  phases.  The  first  (the  feasibility 
phase )  finds  a  feasible  point  by  minimizing  the  sum  of  infeasibilities;  the  second  (the  QP  phase) 
minimizes  the  quadratic  objective  function  in  the  feasible  region. 

In  a  null-space  method  for  large  quadratic  problems,  the  following  procedure  can  be  used 
to  find  a  feasible  point.  Given  a  basis  B  (a  nonsingular  m  x  m  submatrix  of  4),  a  point  x0 
can  computed  such  that  Axo  =  b,  and  then  tested  for  feasibility  with  respect  to  the  bounds.  If 
some  of  the  bounds  are  violated,  a  direction  can  be  computed  that  strictly  decreases  the  sum  of 
bound  violations,  yet  remains  “on”  the  general  constraints.  Once  a  variable  satisfies  its  bounds, 
it  is  not  allowed  to  become  infeasible  in  subsequent  iterations.  In  a  typical  null-space  method, 
both  the  feasibility  and  QP  phases  use  the  same  factorizations,  and  the  two-phase  nature  of  the 
algorithm  is  reflected  by  changing  the  function  being  minimized  from  the  sum  of  infeasibilities  to 
the  quadratic  objective  function  (see  Gill,  Murray,  Saunders  and  Wright,  1985). 

Unfortunately,  this  approach  will  be  inefficient  within  a  QP  algorithm  based  on  direct  solution 
of  the  KT  system,  since  the  factors  of  B  cannot  be  used  to  initiate  the  QP  phase.  The  inefficiency 
is  even  more  serious  when  the  QP  is  a  “later”  subproblem  within  an  SQP  method,  since  a  feasible 
point  for  the  QP  can  usually  be  obtained  directly  from  knowledge  of  the  active  set  in  the  previous 
subproblem.  If  no  iterations  are  required  to  find  a  feasible  point,  the  effort  required  to  factorize 
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the  basis  would  be  wasted.  Similar  inefficiences  occur  if  ill-conditioning  in  the  QP  phase  causes 
a  loss  of  feasibility. 

To  avoid  these  difficulties,  the  feasibility  phase  can  be  modified  so  that  it  attempts  to  reduce 
the  objective  function  while  simultaneously  improving  feasibility.  The  objective  function  in  the 
feasibility  phase  then  becomes  a  composite  objective  function  (a  weighted  sum  of  the  infeasibilities 
and  the  original  quadratic  objective  function).  With  this  approach,  the  search  direction  and 
multiplier  vector  satisfy  an  KT  system  similar  to  (1.6)  in  both  phases.  The  major  difference 
between  the  feasibility  and  QP  iterations  is  that  the  steplength  a  in  (1.2)  is  restricted  to  ensure 
that  the  number  of  violated  bounds  does  not  increase. 

An  alternative  strategy  for  the  feasibility  phase  has  been  suggested  in  the  single-phase  meth¬ 
ods  of  Hoyle  (1986).  In  this  case,  xq  is  chosen  to  satisfy  the  bound  constraints,  and  each  search 
direction  satisfies  a  system  of  the  form 


HTMr)’ 


where  r  -  Ax  -  b.  Unless  r  in  (4.1)  is  zero,  the  general  constraints  are  not  satisfied.  As  soon  as 
a  step  a  =  1  is  taken,  r  becomes  zero,  and  the  iterates  thereafter  satisfy  all  the  constraints. 


Neither  of  these  approaches  is  completely  satisfactory  for  the  SC  method.  With  a  composite 
objective  function,  the  gradient  vector  changes  in  a  discrete  fashion  as  variables  become  feasible 
and  so  must  be  recomputed  at  each  iteration,  which  makes  it  impossible  to  save  the  solve  with 
A'o  during  the  feasibility  phase  (see  Section  3).  If  we  apply  the  approach  based  on  solving  (4.1), 
Theorem  1.1  does  not  apply  as  long  as  r  is  nonzero,  and  fixing  a  variable  may  cause  AFR  to 
become  rank-deficient.  In  this  situation,  special  procedures  must  be  invoked  to  maintain  full 
rank  of  the  working  set. 

A  composite  objective  function  that  seems  well  suited  to  the  SC  method  involves  solving  a  QP 
subproblem  with  an  additional  variable  £  (the  artificial  variable)  associated  with  the  infeasibilities. 
As  in  the  approach  of  Hoyle,  the  bound  constraints  are  always  satisfied,  so  that  the  procedure 
may  begin  with  any  xo  satisfying  £  <  xo  <  u.  Let  ro  denote  the  residual  b  -  Ax o,  let  the  unit 
vector  s  be  defined  by  s  =  ro/||ro||,  and  let  fo  =  1-  Given  a  positive  weight  p,  we  solve  the 
modified  quadratic  program 


minimize  p£  +  cTx  +  ^ xrIIx 


subject  to  (  A  s)  ^ ^  =  b,  f  <  x  <  u,  0  <  £  <  1, 


for  which  (xq.^o)  is  feasible.  If  the  artificial  variable  ever  becomes  zero,  a  feasible  point  has  been 
found,  and  £  is  thereafter  excluded  from  the  problem. 

With  formulation  (4.2),  full  rank  of  the  initial  working  set  implies  full  rank  in  all  subsequent 
working  sets,  so  that  no  special  procedures  are  needed  to  correct  for  rank  deficiency.  Further,  the 
gradient  of  the  composite  objective  function  of  (4.2)  is  a  smooth  function  of  x.  If  p  is  sufficiently 
large,  the  solution  of  (4.2)  is  identical  to  that  of  the  original  problem  (1.1). 

Note  that  the  “artificial  column”  s  is  almost  always  dense.  Consequently,  £o  is  treated  as 
initially  fixed  on  its  upper  bound  so  that  it  is  not  included  in  A'q.  If  p  is  large  enough,  it  is 
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possible  to  ensure  that  £  will  be  the  first  variable  to  be  freed.  The  reduced  cost  for  the  upper 
bound  on  £  is  p  —  which  exceeds  the  reduced  cost  for  any  remaining  non-optimal  variable  if 

p  >  sTn  +  max  {  \gj  -  ajw\  :  j  6  J},  (4.3) 

where  g  =  g(x o  +  p)  and  J  is  the  index  set  of  non-optimal  fixed  variables.  The  lower  bound  in 
(4.3)  may  be  used  as  an  initial  estimate  of  p. 

If  the  artificial  variable  ever  moves  to  its  lower  bound  immediately  after  being  freed,  the 
Schur  complement  (a  single  element)  is  discarded.  This  situation  often  occurs  when  the  artificial 
variable  is  introduced  to  rectify  infeasibility  during  the  QP  phase. 

In  any  method  that  relies  on  a  composite  objective  function,  a  strategy  must  be  included  to 
attempt  to  decide  whether  no  feasible  point  exists  for  the  original  problem  (1.1),  or  whether  p 
has  not  yet  become  sufficiently  large.  (No  such  strategy  can  ever  be  guaranteed.)  The  value  of 
p  is  typically  increased  if  £  >  0  at  the  solution  of  (4.2) — for  example,  p  could  be  multiplied  by 
a  factor — but  eventually  the  algorithm  must  “give  up”  and  declare  that  the  constraints  of  (1.1) 
appear  to  be  infeasible.  A  strategy  that  increases  p  gradually  may  be  inefficient  if  only  a  single 
QP  is  solved,  since  many  values  of  p  may  be  required.  When  a  sequence  of  related  problems  is 
solved,  however,  the  value  of  p  from  one  QP  is  usually  a  satisfactory  choice  for  the  next. 

If  no  feasible  point  exists,  it  is  often  desirable  to  locate  the  minimium  sum  of  infeasibilities. 
Although  the  solution  of  (4.2)  does  minimize  a  weighted  sum  of  infeasibilities  if  p  is  large  enough, 
the  weights  are  essentially  arbitrary  (since  they  depend  on  the  initial  point). 


5.  Computing  the  Initial  Factorization 

The  matrix  A'o  is  represented  by  its  symmetric  indefinite  factorization  (see,  e.g.,  Bunch  and 
Parlett,  1971,  and  Bunch  and  Kaufman,  1977): 

A'o  =  LDLt,  (5.1) 

where  L  is  lower  triangular  and  D  is  block  diagonal,  with  1  X  1  or  2  X  2  blocks.  (The  latter  are 
required  to  retain  numerical  stability.) 

An  effective  and  widely  used  implementation  of  (5.1)  for  sparse  matrices  is  the  Harwell 
routine  MA27  (Duff  and  Reid,  1982),  which  is  a  three-phase  method.  The  ANALYZE  phase  is 
purely  symbolic  (i.e.,  uses  only  the  sparsity  pattern  of  A'o),  and  applies  a  version  of  the  minimum- 
degree  algorithm  intended  to  define  a  symmetric  ordering  that  produces  low  fill-in  in  L.  In  the 
subsequent  FACTORIZE  phase,  the  numerical  factors  (5.1)  are  computed  using  the  actual  entries 
in  A'o  ordered  as  prescribed  by  ANALYZE,  with  further  symmetric  interchanges  performed  if 
necessary  for  numerical  stability.  Finally,  the  solution  of  K$x  =  b  is  computed  in  the  SOLVE 
phase. 

5.1.  A  specialized  ANALYZE  phase.  The  direct  application  of  MA27  is  very  effective  for 
problems  in  which  II  is  sparse  and  there  are  few  general  constraints  (i.e.,  m  is  small  relative 
to  n).  Many  problems  in  statistics  have  this  feature,  since  they  require  nonnegativity  of  all 
variables,  with  the  single  general  constraint  JV  Xi  =  1.  KT  systems  with  small  m  are  best 
handled  by  applying  the  ANALYZE  phase  to  the  matrix  II  only.  A  suitable  ordering  for  the  full 
KT  system  may  then  be  determined  by  expanding  the  data  structure  to  include  the  constraint 
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rows  and  columns.  If  all  the  lxl  pivots  are  numerically  acceptable  in  the  FACTORIZE  phase, 
the  first  search  direction  and  multiplier  vector  would  satisfy  the  range-space  equations: 

~  AFrHfr9f r  and  HFrPFr  =  AfRir  —  gFR  (5-2) 

(see,  e.g.,  Gill  et  a L,  1982).  In  practice,  some  lxl  pivots  may  be  rejected — for  example,  any 
pivot  corresponding  to  a  free  slack  variable.  However,  if  m  is  not  too  large,  the  symbolic  ordering 
for  Hfr  is  still  likely  to  provide  a  numerically  stable  factorization. 

When  m  is  not  small  relative  to  n,  the  rows  of  AFR  must  be  included  in  the  ANALYZE 
phase.  Unfortunately,  the  minimum-degree  algorithm  assumes  that  no  2  x  2  pivots  occur  during 
the  factorization,  and  hence  that  the  diagonal  elements  of  A'o  are  nonzero.  Since  K0  always  has 
a  zero  diagonal  block  in  the  lower  right-hand  corner,  the  symbolic  ordering  from  ANALYZE  is 
often  changed  substantially  during  the  FACTORIZE  phase.  In  some  cases,  the  resulting  additional 
fill-in  increases  the  work  required  to  operate  with  the  factors.  The  problem  is  exacerbated  if  zero 
diagonal  elements  occur  within  H — for  example,  the  diagonals  corresponding  to  slack  variables. 

In  our  experience  with  large  m,  an  increase  in  fill-in  compared  to  the  predicted  level  has 
occurred  consistently,  even  when  all  the  diagonals  of  H  are  nonzero.  A  possible  explanation  is 
that  the  number  of  nonzeros  in  a  given  row  of  AFR  is  likely  to  be  less  than  the  total  number 
of  nonzeros  in  a  given  row  of  HFR  and  the  corresponding  column  of  AFR.  Consequently,  the 
minimum-degree  algorithm  may  well  choose  an  ordering  with  many  zero  diagonals.  The  root  of 
the  difficulty  seems  to  be  the  persistence  of  zero  diagonal  pivots  in  the  reduced  matrix. 

The  special  treatment  of  zero  diagonal  elements  during  the  minimum-degree  ordering  will  be 
incorporated  in  a  new  version  of  MA27  (Duff,  private  communication).  However,  the  efficiency 
of  the  current  version  of  MA27  on  KT  systems  may  be  significantly  improved  in  some  cases  by 
utilizing  the  facility  of  MA27  to  accept  a  preassigned  ordering  for  the  FACTORIZE.  Let  T  denote 
a  2  x  2  matrix  (called  a  tile)  of  the  form 


-c*)- 


where  h  is  an  element  of  H ,  and  a  and  d  are  elements  of  A.  A  symmetric  tile  has  the  useful  property 
that  it  is  nonsingular  if  a  is  nonzero;  moreover,  its  nonzero  eigenvalues  must  have  opposite  sign. 
Our  ordering  strategy  is  to  define  a  permutation  matrix  II  such  that  the  upper  left-hand  corner 
of  a  symmetrically  permuted  version  of  A'o  consists  of  a  symmetric  “checkerboard”  matrix  T  of 
tiles,  i.e., 

M  =  nTK0n  =  [F  Ej,  (5.4) 


where  T  has  the  form 


Til  T\i  T\3 

tT  qn  <r» 

M2  *  22  *  23 

t,T3  tT3  t3  3 
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Each  tile  (5.3)  is  essentially  a  pairing  of  a  column  of  II  with  a  column  of  A.  For  example,  when 
T  is  4  X  4,  one  possible  arrangement  is 


f  /ill 
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Once  an  element  a,j  is  selected  for  a  diagonal  tile  Tqq,  the  elements  of  column  j  of  A  become 
entries  of  the  off-diagonal  tiles  Tpq ,  where  p  =  1,2, ...,q  -  1.  The  remaining  elements  in  row  i  of 
A  become  ineligible  for  inclusion  in  diagonal  tiles. 

In  the  proposed  method,  the  sparsity  pattern  of  M  in  (5.4)  is  first  processed  by  a  minimum- 
degree  ordering  that  treats  each  tile  in  T  as  a  single  element  that  is  nonzero  if  and  only  if  the  tile 
is  a  nonzero  matrix.  A  suitable  ordering  for  the  full  matrix  is  then  determined  by  expanding  the 
data  structure  to  consider  each  tile  as  a  2  x  2  matrix.  The  idea  is  to  force  MA27  to  use  a  pivot 
strategy  contrary  to  that  typically  used  in  the  Bunch-Parlett  algorithm — i.e.,  instead  of  choosing 
2  x  2  pivots  only  when  no  suitable  diagonal  pivots  are  available,  2x2  pivots  are  preferred. 

Arranging  the  Hessian  and  general  constraints  of  (1.6)  into  tiles  is  a  very  effective  method 
for  dealing  with  slack  variables.  If  all  free  slacks  are  picked  first  for  inclusion  in  the  diagonal  tiles, 
the  associated  2x2  pivots  cause  no  fill-in  during  the  symmetric  indefinite  factorization. 

Many  real-world  problems  have  the  desirable  feature  that  a  “natural’'  pairing  of  columns  of 
If  and  A  is  suggested  by  the  nature  of  the  underlying  physical  system.  By  applying  the  above 
technique  in  the  ANALYZE  phase,  fill-in  during  the  factorization  may  be  substantially  reduced 
compared  to  an  application  of  MA27  alone.  In  Table  1  we  give  some  factorization  statistics 
obtained  from  applying  the  tiling  strategy  to  solve  QP  subproblcms  arising  in  the  optimal  dis¬ 
tribution  of  electrical  power  (Burchett,  llapp  and  Vierath,  1984).  For  each  problem  we  give  the 
dimension  of  the  projected  Hessian,  the  dimension  of  the  KT  system,  the  number  of  nonzeros  in 
A' o,  the  number  of  elements  in  L  predicted  by  the  ANALYZE,  and  the  actual  number  of  nonzeros 
generated  during  the  FACTORIZE. 

Table  1 

Factorization  statistics  for  various  KT  systems 


dim  (ZTIIZ) 


56 

266 

219 

1 

157 


ANALYZE 


13858 

31697 

33399 


63278 

65170 


FACTORIZE 


34676 

38458 

38700 

69323 

69620 


There  is  a  clear  need  for  a  general  procedure  that  will  find  a  suitable  tiling  for  an  arbitrary 
sparse  KT  system.  Clivon  any  2m  x  2m  tiled  matrix  T,  there  exists  a  permutation  77  such  that 

f  =  77t77/  =  ^  (5.5) 
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where  B  is  an  m  x  m  subset  of  the  columns  of  A.  Since  A  has  full  rank,  there  must  exist 
permutations  FI  and  77  for  which  B  is  nonsingular.  Hence  there  must  exist  at  least  one  tiling 
such  that  T  is  nonsingular.  Finding  a  “good”  set  of  columns  from  A  is  very  similar  to  the 
problem  of  choosing  a  nonsingular  basis.  Because  the  selected  columns  of  A  automatically  define 
the  rows  and  columns  of  77  to  be  used  in  T,  the  “best”  arrangement  will  minimize  the  number 
of  nonzero  tiles.  Once  an  initial  T  matrix  has  been  chosen,  updating  T  should  be  relatively 
easy  when  refactorization  is  required.  In  particular,  no  change  is  necessary  if  none  of  the  free 
variables  corresponding  to  the  selected  columns  of  77  has  become  fixed.  This  property  implies 
that  construction  of  the  tiles  should  favor  columns  of  77  that  are  likely  to  remain  free. 

5.2.  Relationship  between  methods.  If  the  KT  system  is  solved  by  taking  pivots  from 
llFR  first,  the  symmetric  indefinite  factorization  implicitly  forms  matrices  that  define  the  class 
of  range-space  methods  (cf.  (5.2)).  The  following  theorem  shows  that  a  different  pivot  order  will 
cause  the  symmetric  indefinite  factorization  to  form  matrices  associated  with  the  reduced-gradient 
method. 


Lemma  5.1.  Define  two  matrices  M  and  M  such  that 


-(?  c) 


and  M 


G  FT 


where  (I  and  G  are  k  x  k  and  nonsingular,  and  A/  is  obtained  from  M  by  performing  symmetric 
permutations  of  the  first  k  rows  and  columns.  Then  M /G  =  M/G.  | 


Theorem  5.1.  Assume  without  loss  of  generality  that  IF  FR  and  A  FR  may  be  partitioned  so  that 

tm  gt\  ,  .  .  „ 


f fr  —  ^  Q  jj  ^  and  Afh  —  {B  S), 


where  B  and  II i  are  m  x  rn  with  B  nonsingular.  Let  T  denote  a  2m  x  2 in  tiled  matrix  formed 
from  elements  of  H\  and  B.  If  M  denotes  the  permuted  KT  system  (5.4)  then 

M/T  =  ZTliZ ,  where  Z  =  I  7  1  . 


Proof.  I  ,et  M  and  7  denote  the  matrices 


A7  =  I  B 


//,  77 7  G 


Gr  S1  //. 


S  and  7 


Then  by  definition, 


-(*'  BT)' 


M/T  =  11,  -  (  Gt  S1' )  T- '  (  °s  )  =  II,  -  (  GT  ST  )  (  I’'  )  , 


where  l )  and  are  defined  by  the  (block-triangular)  equations 


(«•  wT)Q)=a)- 
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Substituting  for  l't  and  V->  from  (5.7)  into  (5.6).  we  obtain 

M/f  =  1U  -  CJB-'S  -  SrB~TG  +  ST B~t II]B~xS 

H'o 

The  result  now  follows  from  Lemma  5.1.  | 

This  result  implies  that  the  symmetric  indefinite  factorization,  when  used  in  conjunction  with 
a  nonsingular  tiling,  implicitly  forms  and  factorizes  the  projected  Hessian.  Moreover,  ZT // Z  is 
computed  in  a  way  that  exploits  symmetry.  In  situations  when  the  projected  Hessian  is  sparse,  the 
symmetric  indefinite  factorization  thus  provides  an  effective  means  of  exploiting  sparsity  during 
the  factorization. 

In  general,  direct  factorization  of  the  KT  system  provides  the  opportunity  to  define  a  set  of 
methods  for  the  large-scale  case,  with  each  method  determined  by  the  order  of  the  columns  in 
the  initial  I\T  factorization. 


6.  Formulation  of  the  Constraints 

In  this  section  we  discuss  two  aspects  of  the  occasional  inefficiency  resulting  from  use  of  the 
standard  form  (1.1). 

6.1.  Treatment  of  slacks  in  the  standard  form.  In  the  Schur-complement  algorithm,  each 
free  slack  column  adds  a  unit  row  and  column  to  h'o,  so  that  symmetric  interchanges  cannot 
move  the  corresponding  unit  element  to  the  diagonal.  In  order  to  avoid  unnecessary  fill-in  during 
the  initial  factorization,  the  two  (unit)  nonzeros  associated  with  each  slack  must  be  formed  into 
a  2  x  2  tile  as  described  in  Section  5.  If  there  are  k  free  slacks.  AFR  is  of  the  form 


=  ( 

\h  MJ 


(6.1) 


where  /<,  is  the  identity  of  order  k.  With  suitable  permutation,  the  associated  KT  system  has  the 


term 


Ao  = 


\ 


V 


(6.2) 
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A  trivial  sequence  of  interchanges  in  the  leading  2k  X  2k  rows  and  columns  of  (6.2)  gives  the 
required  riling,  with  k  matrices 

(,  ’) 

on  the  principal  diagonal.  If  2  x  2  pivots  are  selected  from  the  diagonal,  no  additional  fill-in 
occurs  in  the  leading  2k  columns.  Ideally,  a  symmetric  indefinite  solver  such  as  MA27  should 
treat  singleton  rows  and  columns  in  this  way. 

6.2.  General  inequalities.  Constraints  sometimes  arise  “naturally”  in  the  form 


<(')»<«. 


(6.3) 
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7.  Discussion 


In  an  active-set  method  based  on  (6.3),  the  working-set  matrix  undergoes  both  row  and  col¬ 
umn  changes.  For  sparse  problems,  few  authors  have  considered  the  associated  complications  of 
updating  sparse  factors  that  vary  in  dimension.  (See  Gill  et  a 1.,  1987a,  for  an  exception.) 

In  contrast,  the  Schur-complement  method  may  be  generalized  quite  readily  to  problems 
with  constraints  of  the  form  (G.3).  The  equations  associated  with  each  iteration  are  identical  to 
( l.G).  except  that  Ay„  is  effectively  just  the  submatrix  A\  in  (6.1).  If  a  general  constraint  with 
gradient  a  is  added  to  the  working  set,  the  KT  system  is  bordered  by  a  vector  made  up  from  the 
free  elements  of  a. 

The  update  for  deletion  of  a  general  constraint  from  the  working  set  can  be  illustrated  easily 
for  t he  first  iteration.  If  the  constraint  to  be  deleted  corresponds  to  the  s-th  row  of  Afr,  the 
search  direction  satisfies  the  bordered  system 

(  II FR  ARR  \  /  / Qfr  ^ 

r  "JUrl  J- 

where  -y  may  be  discarded.  The  Lagrange  multipliers  for  the  general  constraints  in  the  working 
set  may  be  recovered  by  deleting  the  s-th  element  of  z. 

Since  Afh  must  be  maintained  subject  to  both  row  and  column  updates,  it  is  necessary  to 
access  .1  by  both  rows  and  columns.  In  the  case  of  linear  programming,  if  m  <  n  the  standard 
form  is  efficient  (and  requires  only  column  updates),  while  if  m  >  n  one  may  prefer  to  solve  the 
dual  problem,  again  in  standard  form.  However,  once  the  form  (6.3)  is  assumed,  efficiency  can  be 
retained  regardless  of  the  ratio  of  m  to  n.  This  advantage  is  all  the  more  important  for  nonlinear 
problems,  where  the  device  of  solving  the  dual  is  not  necessarily  applicable  or  efficient. 


7.  Discussion 

An  important  feature  of  the  Schur-complement  approach  is  that  any  advances  in  methods  for 
.-.parse  linear  equations  are  immediately  applicable  to  computation  of  the  initial  factorization  of 
/do.  This  approach  is  especially  effective  when  A'o  has  special  structure  that  can  be  exploited  in  a 
"black  box"  equation  solver- -e.g.,  when  the  constraints  are  derived  from  network  flow  problems. 

Many  new  machines  have  become  available  in  recent  years  with  vector  and/or  parallel  ca- 
pabilit  ies.  In  most  cases,  the  novelty  of  their  architecture  is  not  exploited  by  existing  software. 
In  the  large-scale  mathematical  programming  area,  a  portable  Fortran  code  (e.g.,  MINOS,  SCI- 
( ‘ON  If)  will  run  successfully  on  vector  machines  like  the  CRAY-1  or  CRAY-XMP,  but  most  of 
the  computation  will  be  performed  only  in  scalar  mode. 

ii  can  be  expected  that  sparse  linear  equation  solvers  will  eventually  be  developed  for  novel 
machines  intended  for  scientific  computation.  While  explicit  updating  of  LU  factors  will  probably 
remain  efficient  on  conventional  machines  (see  Gill  et  a/.,  1987a),  the  Schur-complement  approach 
is  likely  to  provide  the  most  effective  method  for  machines  with  advanced  architectures.  In  the 
case  of  ve<  tor  supercomputers,  techniques  have  already  been  developed  that  allow  large  linear 
systems  to  be  solved  efficiently  (see  Ashcraft  et  a/.,  1987).  We  therefore  believe  that  the  efficient 
solution  of  large  quadratic  programming  problems  on  vector  machines  is  now  feasible. 

The  Schur  complement  method  described  here  has  been  implemented  within  an  SQP  method 
and  applied  to  the  solution  of  optimal  power  flow  (OPF)  problems  arising  in  the  electrical  power 
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industry.  OPF  problems  concern  the  optimal  generation  and  distribution  of  electrical  power  in 
a  network  (see  Stott,  Alsac  and  Marinho,  1980).  Exact  second  derivatives  are  available  for  these 
problems  and  a  specialized  Newton-based  SQP  method  has  been  applied  with  great  success  to 
general  OPF  problems  of  a  size  previously  considered  intractable. 


Acknowledgements 

We  would  like  to  thank  Rob  Burchett  for  numerous  discussions  on  large-scale  SQP  methods.  He 
also  provided  the  motivation  for  considering  the  Schur-complement  method.  We  are  grateful  to 
Nicholas  Higham  for  his  careful  reading  of  the  manuscript. 


References 

Ashcraft,  C.  C.,  Grimes,  R.  G.,  Lewis,  J.  G.,  Peyton,  B.  W.  and  Simon,  H.  D.  (1987).  Recent 
progress  in  sparse  matrix  methods  for  large  linear  systems  on  vector  supercomputers,  En¬ 
gineering  Technology  Applications  Report  ETA-TR-56,  Boeing  Computer  Services,  Seattle, 
Washington. 

Best,  M.  J.  (1984).  Equivalence  of  some  quadratic  programming  algorithms.  Mathematical  Pro¬ 
gramming  30,  71-87. 

Bisschop,  J.  and  Meeraus,  A.  (1977).  Matrix  augmentation  and  partitioning  in  the  updating  of 
the  basis  inverse.  Mathematical  Programming  13,  241-254. 

Bisschop,  J.  and  Meeraus,  A.  (1980).  Matrix  augmentation  and  structure  preservation  in  linearly 
constrained  control  problems,  Mathematical  Programming  18,  7-15. 

Bjorck,  A.  (1987).  Iterative  refinement  and  reliable  computing,  in  M.  G.  Cox  and  S.  J.  Hammarling 
(eds.).  Advances  in  Reliable  Numerical  Computing,  Oxford  University  Press. 

Bunch,  J.  R.  and  Kaufman,  L.  C.  (1977).  Some  stable  methods  for  calculating  inertia  and  solving 
symmetric  linear  equations,  Mathematics  of  Computation  31,  163-179. 

Bunch,  J.  R.  and  Parlett,  B.  N.  (1971).  Direct  methods  for  solving  symmetric  indefinite  systems 
of  linear  equations,  SIAM  J.  on  Numerical  Analysis  8,  639-655. 

Bunch,  J.  R.  and  Kaufman,  L.  C.  (1980).  A  computational  method  for  the  indefinite  quadratic 
programming  problem,  Linear  Algebra  and  its  Applications  34,  341-370. 

Burchett,  R.  C.,  Happ,  H.  H.  and  Wirgau,  I<.  (1982).  Large-scale  optimal  power  flow,  IEEE 
Transactions  on  Power  Apparatus  and  Systems,  3722-3732. 

Busovaca,  S.  (1985).  Handling  degeneracy  in  a  nonlinear  t\  algorithm,  Technical  Report  CS-85-34, 
University  of  Waterloo,  Waterloo,  Canada. 

Chan.  T.  F.  (1984).  Deflated  decomposition  of  solutions  of  nearly  singular  systems,  SIAM  J.  on 
Numerical  Analysis  21,  738-754. 

Chan,  T.  F.  and  Grossi,  T.  A.  (1985).  DBEPACK:  A  program  package  for  solving  bordered 
singular  systems,  Technical  Report  YALEU/DCS/RR-336  Yale  University. 

Coleman,  T.  F.  and  Pothen,  A.  (1986).  The  null  space  problem  I.  Complexity,  SIAM  J.  on 
Algebraic  and  Discrete  Methods  7,  527-537. 


1 


References 


19 


Coleman,  T.  F.  and  Sorensen,  D.  C.  (1984).  A  note  on  the  computation  of  an  orthogonal  basis 
for  the  null  space  of  a  matrix,  Mathematical  Programming  29,  234-242. 

Cottle,  R.  W.  (1974).  Manifestations  of  the  Schur  complement,  Linear  Algebra  and  its  Applica¬ 
tions  8,  189-211. 

Dax,  A.  (1985).  The  computation  of  descent  directions  at  degenerate  points,  Technical  report, 
Hydrological  Service,  PO  Box  6381,  Jerusalem,  Israel. 

Djang,  A.  (1980).  Algorithmic  Equivalence  in  Quadratic  Programming,  Ph.D.  Thesis,  Stanford 
University,  California. 

Duff,  I.  S.,  Erisman,  A.  M.  and  Reid,  J.  K.  (1986).  Direct  Methods  for  Sparse  Matrices,  Oxford 
University  Press,  Oxford,  New  York. 

Duir.  I.  S.  and  Reid,  J.  K.  (1982).  MA27 — a  set  of  Fortran  subroutines  for  solving  sparse  symmetric 
sets  of  linear  equations,  Report  AERE  R- 10533,  Computer  Science  and  Systems  Division, 
AF.RE  Harwell,  Oxfordshire,  England. 

Duff,  1.  S.  and  Reid,  J.  K.  (1984).  The  multifrontal  solution  of  unsymmetric  sets  of  linear  equa¬ 
tions,  SIAM  J.  on  Scientific  and  Statistical  Computing  5,  633-641. 

Fletcher.  R.  (1981).  Practical  Methods  of  Optimization,  Volume  2,  Constrained  Optimization, 
John  Wiley  and  Sons,  New  York  and  Toronto. 

Fletcher,  R.  (1985).  Degeneracy  in  the  presence  of  round-off  errors.  Numerical  Analysis  Report 
NA89,  Department  of  Mathematical  Sciences,  University  of  Dundee,  Scotland. 

Gilbert,  J.  R.  and  Heath,  M.  T.  (1986).  Computing  a  sparse  basis  for  the  null  space,  Report 
I  RS6-730,  Department  of  Computer  Science,  Cornell  University,  Ithaca,  New  York. 

Gill.  P.  E.,  Golub,  G.  H.,  Murray,  W.  and  Saunders,  M.  A.  (1974).  Methods  for  modifying  matrix 
factorizations.  Mathematics  of  Computation  28,  505-535. 

Gill.  P.  E.,  Gould,  N.  1.  M.,  Murray,  W.,  Saunders,  M.  A.  and  Wright,  M.  H.  (1982).  Range-space 
methods  for  convex  quadratic  programming.  Report  SOL  82-14,  Department  of  Operations 
Research,  Stanford  University. 

Gill,  P.  E.  and  Murray,  W.  (1978).  Numerically  stable  methods  for  quadratic  programming, 
Mathematical  Programming  14,  349-372. 

Gill,  P.  E..  Murray.  \V\,  Saunders,  M.  A.  and  Wright,  M.  11.  (1984).  Sparse  matrix  methods  in 
optimization,  SIAM  J.  on  Scientific  and  Statistical  Computing  5,  562-589. 

Gill.  P.  E..  Murray,  W..  Saunders,  M.  A.  and  Wright,  M.  H.  (1985).  Software  and  its  relationship 
to  methods,  in  P.  T.  Boggs,  R.  H.  Byrd  and  R.  B.  Schnabel  (eds.),  Numerical  Optimization 
WSt.  SIAM,  Philadelphia,  139  159. 

Gill.  P.  E.,  Murray,  W..  Saunders.  M.  A.  and  Wright,  M.  H.  (1987a).  Maintaining  LU  factors  of 
a  general  sparse  matrix.  Linear  Algebra  and  its  Applications  88/89,  239-270. 

Gill.  P.  E..  M  urray,  W..  Saunders,  M.  A.  and  Wright,  M.  H.  (1987b).  A  practical  anti-cycling  pro¬ 
cedure  for  linear  and  nonlinear  programming.  Report  SOL  87-13,  Department  of  Operations 
Research,  Stanford  University. 

j  Gill.  P.  E..  Murray.  W..  Saunders.  M . Stewart.  G.  W.  and  Wright,  M.  II.  (1985).  Properties 

of  a  representation  of  a  basis  for  the  null  space.  Mathematical  Programming  33,  172  186. 

Gill.  P.  I...  Murray.  W.  and  Wright.  M.  II.  (1981).  Practical  Optimization,  Academic  Press, 
London  and  New  York. 

i 

I 

! 

i 

i 

i 


20 


A  Schur-Complement  QP  Method 


Gould,  N.  I.  M.  (1985).  On  practical  conditions  for  the  existence  and  uniqueness  of  solutions 
to  the  general  equality  quadratic  programming  problem,  Mathematical  Programming  32, 
90-99. 

Hoyle,  S.  C.  (1986).  A  single-phase  method  for  quadratic  programming,  Report  SOL  86-9,  De¬ 
partment  of  Operations  Research,  Stanford  University. 

Murtagh,  B.  A.  and  Saunders,  M.  A.  (1978).  Large-scale  linearly  constrained  optimization,  Math¬ 
ematical  Programming  14,  41-72. 

Osborne,  M.  R.  (1985).  Finite  Algorithms  in  Optimization  and  Analysis ,  John  Wiley,  Chichester 
and  New  York. 

Ryan,  D.  M.  and  Osborne,  M.  R.  (1986).  On  the  solution  of  highly  degenerate  linear  programs, 
Technical  report,  Department  of  Statistics,  Australian  National  University. 

Sorensen,  D.  C.  (1977).  Updating  the  symmetric  indefinite  factorization  with  applications  in 
a  modified  Newton  method,  Report  ANL-77-49,  Argonne  National  Laboratory,  Argonne, 
Illinois. 

Stott,  B.,  Alsac,  O.  and  Marinho,  J.  L.  (1980).  The  optimal  power  flow  problem,  in  A.  M.  Erisman, 
K.  W.  Neves  and  M.  H.  Dwarakanath  (eds.),  Electric  Power  Problems:  The  Mathematical 
Challenge,  SIAM,  Philadelphia,  327-351. 

Wilkinson,  J.  H.  (1965).  The  Algebraic  Eigenvalue  Problem,  Oxford  University  Press. 


SECURITY  CLASSIFICATION 


ICC  (Hm  OataBMafMO 


REPORT  DOCUMENTATION  PAGE 


4.  TITLE  (<0*4  SmkHUm) 


A  Sehur-Complement  Method  for  Sparse 
Quadratic  Programming 


T.  AUTHORS 

Philip  E .  Gill,  Walter  Murray, 

Michael  A.  Saunders,  Margaret  H.  Wright 


•  PERFORMINO  OROAMIIATIOM  MAMC  AMO  AOORESS 

Department  of  Operations  Research  -  SOL 
Stanford  University 
Stanford,  CA  94305 


II.  COMT-ROLLINO  OFFICE  MAMC  AMO  ADORES! 

Of  rice  or  Naval  Research  -  Dept,  of  the  Navy 
80  J  N .  Quincy  Street 
Arlington,  VA  22217 


Air  Force  Ofice  of  Scientific  Research/NM 

Bu i 1 d i ng  4  1 0 
Bolling  Air  Force  Base 
Washington,  DC  2Q332 


4  DISTRIBUTION  STATEMENT  (ml  Mm  J»  apart} 


HEAD  INSTRUCTIONS 
BEFORE  COMPLETING  FORM 


»•  TVP*  OF  REPORT  B  PCMOO  COVERED  f 


Technical  Report 


ONTRACT  OR  ORAMT  NUMBENCa} 


N000 1 4-87-K-O 142 . 


AFOSR-87-01 962 


IS.  REPORT  DATS 

October  1987 


20  pp. 


IB-  SECURITY  CLAES,  (ml  BU*  i 

UNCLASSIFIED 


This  document  has  been  approved  for  public  release  and  sale; 
Its  distribution  Is  unlimited. 


IT  DISTRIBUTION  STATEMENT  (ml  *•  atatracl  mmlmmm*  kt  Blmmk  SB,  IIBHM1  Mm 


It.  KEY  VOROS  (CmmUnum  mm  rmrmtmm  ml 4m  It  mmmmmmmj  mm4  1 4mm  II  If  kf  MmI 


liige  scale  quadratic  programming,  supercomputers,  constrained 
opt i  ilzation,  sparse  matrix  methods,  Schur  complement. 


DO  i  jm  n  1473  |#'1 


>MOLSTE 


SECURITY  CLASStFlCATlC 


l(  (Wkmm  Data  InMral 


MCUWITY  CLAmUCATIQH  OB  ThU  BAOgfW>l«n  P«li  fcltwO 

Abstract 


In  applying  active-set  methods  to  sparse  quadratic  programs,  It  is 
desirable  to  utilize  existing  sparse-matrix  techniques.  We  describe  a 
quadratic  programming  method  based  on  the  classical  Schur  complement.  Its 
key  feature  is  that  much  of  the  linear  algebraic  work  associated  with  an 
entire  sequence  of  iterations  involves  a  fixed  sparse  factorization.  Updates 
are  performed  at  every  iteration  to  the  factorization  of  a  smaller  matrix, 
which  may  be  treated  as  dense  or  sparse. 

The  use  of  a  fixed  sparse  factorization  allows  an  "off-che  shelf"  snarse 
equation  solver  to  be  used  repeatedly.  This  feature  is  ideally  suited  to 
problems  with  structure  that  can  be  exploited  by  a  specialized 
factorization.  Moreover,  improvements  in  efficiency  derived  from  exploiting 
new  parallel  and  vector  computer  architectures  are  immediately  applicable. 

An  obvious  appliation  of  the  method  is  in  sequential  quadratic 
programming  methods  for  nonlinearly  constrained  optimization,  which  require 
«  solution  of  a  sequence  of  closely  related  quadratic  programming  subproblems, 
j  We  discuss  some  ways  in  which  the  known  relationship  between  successive 
I  problems  can  be  exploited. 
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